A Lexicalist Approach to the Translation of Colloquial Text

Authors

  • Fred Popowich
  • Davide Turcato
  • Olivier Laurens
  • Paul McFetridge
  • Devlan Nicholson
  • Patrick McGivern
  • Maricela Corzo Pena
  • Lisa Pidruchney
  • Scott McDonald
Abstract

Colloquial English (CE), as found in television programs or typical conversations, differs from the text found in technical manuals, newspapers and books. Phrases tend to be shorter and less sophisticated. In this paper, we look at some of the theoretical and implementational issues involved in translating CE. We present a fully automatic, large-scale, multilingual natural language processing system that translates CE input text, as found in the commercially transmitted closed-caption television signal, into simple target sentences. Our approach is based on Whitelock’s Shake and Bake machine translation paradigm, which relies heavily on lexical resources. The system currently translates from English to Spanish, with translation modules for Brazilian Portuguese under development.

Similar resources

English-Persian Plagiarism Detection based on a Semantic Approach

Plagiarism which is defined as “the wrongful appropriation of other writers’ or authors’ works and ideas without citing or informing them” poses a major challenge to knowledge spread publication. Plagiarism has been placed in four categories of direct, paraphrasing (rewriting), translation, and combinatory. This paper addresses translational plagiarism which is sometimes referred to as cross-li...

A Unified Example-Based and Lexicalist Approach to Machine Translation

We propose an approach to Machine Translation that combines the ideas and methodologies of the Example-Based and Lexicalist theoretical frameworks. The approach has been implemented in a multilingual Machine Translation system.

A Sociolinguistic Scrutiny of the Great Gatsby and its Persian Translation in Light of Hatim and Mason’s Framework

Translation studies essentially deals with a socio-communicatively driven and contextualized enterprise. Viewed hence, it seems that no discipline tends to provide the possibility of studying the interrelations between interlocutors to generate meaning within the interactive social context as precisely as sociolinguistics (Federici, 2018). A sociolinguistic approach to translation seems to be i...

Transforming Standard Arabic to Colloquial Arabic

We present a method for generating Colloquial Egyptian Arabic (CEA) from morphologically disambiguated Modern Standard Arabic (MSA). When used in POS tagging, this process improves the accuracy from 73.24% to 86.84% on unseen CEA text, and reduces the percentage of out-of-vocabulary words from 28.98% to 16.66%. The process holds promise for any NLP task targeting the dialectal varieties of Arabi...

An Efficient Generation Algorithm for Lexicalist MT

The lexicalist approach to Machine Translation offers significant advantages in the development of linguistic descriptions. However, the Shake-and-Bake generation algorithm of Whitelock (1992) is NP-complete. We present a polynomial time algorithm for lexicalist MT generation provided that sufficient information can be transferred to ensure more determinism.

Journal title:
  • CoRR

Volume cmp-lg/9706024  Issue 

Pages  -

Publication date 1997